The Developer's Guide to GenAI

Start with why?

GenAI will change everything

The Landscape

Generate Code

Generate Images

Hands up if you have

  • Used an AI coding assistant
  • Ghiblified a photo

The Ghibli AI Controversy

Studio Ghibli’s Stance

“An Insult to Life Itself”

— Hayao Miyazaki

Flux just isn’t the same

Don’t say the G-word

Raise your hand if you think your job will change…

  • a lot over the next 2 years.
  • a lot over the next 5 years.
  • a lot over the next 10 years.
  • Not much at all.

The Technology

To be precise: predict probabilities of the next word
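A minimal sketch of what "predict probabilities of the next word" means in practice: the model scores every candidate token, and a softmax turns those scores into probabilities. The candidate tokens and scores below are invented for illustration, not taken from any real model.

```python
import math

def next_token_probabilities(logits):
    """Convert raw scores (logits) per candidate token into probabilities."""
    highest = max(logits.values())
    # Subtract the maximum before exponentiating for numerical stability.
    exps = {token: math.exp(score - highest) for token, score in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}

# Hypothetical scores for the word after "The cat sat on the".
LOGITS = {"mat": 3.0, "sofa": 2.0, "moon": 0.0}

probabilities = next_token_probabilities(LOGITS)
most_likely = max(probabilities, key=probabilities.get)
```

The output is a distribution over all candidates, not a single answer; which token actually gets emitted depends on the sampling strategy.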

Tokens != words
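To make "tokens are not words" concrete, here is a toy greedy longest-match tokenizer. Real tokenizers (for example byte-pair encoding) are trained on data and behave differently; the vocabulary below is a made-up assumption, chosen only to show one word splitting into several tokens.

```python
def tokenize(text, vocab):
    """Split text into the longest vocabulary entries, left to right."""
    tokens = []
    position = 0
    while position < len(text):
        # Fall back to a single character when nothing in the vocabulary matches.
        match = text[position]
        for end in range(len(text), position, -1):
            if text[position:end] in vocab:
                match = text[position:end]
                break
        tokens.append(match)
        position += len(match)
    return tokens

# Hypothetical vocabulary, for illustration only.
VOCAB = {"un", "believ", "able", "token", "s", " "}

tokens = tokenize("unbelievable tokens", VOCAB)
```

With this vocabulary, two words become six tokens, which is why context limits and prices are quoted in tokens rather than words.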

Temperature
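Temperature can be sketched as a divisor applied to the model's scores before the softmax: low values sharpen the distribution towards the top token, high values flatten it. The scores below are invented, and providers may apply additional sampling controls that this sketch ignores.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return probabilities for a list of scores, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    highest = max(scaled)
    exps = [math.exp(value - highest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

scores = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(scores, 0.5)  # more deterministic
hot = softmax_with_temperature(scores, 2.0)   # more varied
```

Comparing `cold[0]` with `hot[0]` shows the top token taking a larger share of the probability at the lower temperature.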

Context windows

How much stuff you can put in.
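One common way to live within a context window is to drop the oldest messages once a token budget is exceeded. The sketch below assumes a rough rule of thumb of about four characters per token for English text; a real application would count with the model's own tokenizer. The function names are hypothetical.

```python
def estimate_tokens(text):
    """Rough token estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)

def fit_to_window(messages, max_tokens):
    """Keep the most recent messages whose estimated size fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    # Restore chronological order.
    return list(reversed(kept))

history = ["a" * 40, "b" * 40, "c" * 40]  # about 10 tokens each
trimmed = fit_to_window(history, max_tokens=25)
```

Here only the two newest messages survive, because a third would push the estimate past the budget.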

🎮 Pick Your LLM!

🔄 I/O Capabilities

  • 📥 Input modalities: text, image, audio, voice
  • 📝 Output modalities: text, image, voice
  • 🌍 Multilingual capabilities

⚙️ Technical Features

  • 🧠 Reasoning
  • 🛠 Tool Calling
  • 🔢 Structured Output
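Structured output matters because application code has to parse the reply. A minimal sketch of the receiving side, assuming the model was asked to answer with a JSON object; the field names and types here are hypothetical.

```python
import json

# Hypothetical schema: the fields the application expects and their types.
REQUIRED_FIELDS = {"title": str, "priority": int}

def parse_structured_reply(raw_reply):
    """Parse a JSON reply and check that the expected fields are present."""
    data = json.loads(raw_reply)
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"field {field!r} is missing or not {expected_type.__name__}")
    return data

ticket = parse_structured_reply('{"title": "Fix login bug", "priority": 2}')
```

Models with native structured-output support reduce how often this validation fails, but checking the reply before using it is still a reasonable safeguard.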

📊 Practical Factors

  • 📏 Maximum context window size
  • 💰 Price per token (input/output)
  • Response generation speed
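Because input and output tokens are usually priced separately, a per-request cost estimate is simple arithmetic. The prices in this sketch are placeholders, not any provider's actual rates.

```python
def request_cost(input_tokens, output_tokens, input_price_per_million, output_price_per_million):
    """Estimate the cost of one request from token counts and per-million prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost

# Placeholder prices in dollars per million tokens (assumptions, not real rates).
cost = request_cost(
    input_tokens=2_000,
    output_tokens=500,
    input_price_per_million=3.0,
    output_price_per_million=15.0,
)
```

Multiplying the result by expected request volume gives a quick way to compare models before committing to one.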

🧠 Claude 3.7 Sonnet

🎮 Type: Reasoning & Coding

🧩 Input: Text, Image

🖼 Output: Text

📏 Context: 200K tokens

🛠 Function Calling:

🏗️ Structured Output:

🤔 Reasoning:

💻 Special Ability: Computer Use (mouse, keyboard, browser)


💪 Strengths:

  • Amazing coder
  • Caches content
  • Performs autonomous computer tasks

⚠️ Weaknesses:

  • No image output

✨ Gemini 2.0 Flash

⚡ Type: Speed & Efficiency / Long Context

🧩 Input: Text, Image, Audio, Video, Voice

🖼 Output: Text, Voice

📏 Context: 1M tokens

🛠 Function Calling:

🏗️ Structured Output:

🤔 Reasoning:


💪 Strengths:

  • Very fast response times
  • Cost-effective
  • Excellent for long context tasks
  • Good for high-volume/frequency tasks

⚠️ Weaknesses:

  • Weaker than Gemini Pro on highly complex reasoning

🧠 DeepSeek-R1 (open source)

💡 Type: Coding & Technical Reasoning

🧩 Input: Text, Images (VL variant)

🖼 Output: Text

📏 Context: 128K tokens

🛠 Function Calling:

🏗️ Structured Output: 🔶

🤔 Reasoning:


💪 Strengths:

  • Exceptional coding & mathematical reasoning
  • Strong multilingual capabilities (Chinese+English)
  • Open-source

⚠️ Weaknesses:

  • Fewer modalities than some competitors
  • Less robust content moderation

The Tools

The GenAI Stack

APPLICATION
TOOLING
MODEL / API
PLATFORM AND STORAGE
HARDWARE
@ Sandi Besen

Visual AI Frameworks

  • LangGraph
  • FlowiseAI
  • n8n

Tools

🔍 Web Search

  • API Searches
  • News Analysis

🕸️ Web Scraping

  • Content Extraction
  • Browser Automation

📚 RAG Systems

  • Document Retrieval
  • Context Management

🗄️ Vector DBs

  • Similarity Search
  • Embedding Storage

🔌 API Clients

  • REST/GraphQL
  • Authentication

🗃️ Database

  • SQL/NoSQL
  • Data Querying

💻 Code Gen

  • Code Analysis
  • Autocompletion

🛠️ Dev Tooling

  • Git Operations
  • Execution Envs

⚙️ Shell Access

  • Command Execution
  • System Integration

📂 File System

  • File Operations
  • Data Processing

📧 Messaging

  • Email/SMS
  • Chat Platforms

🔔 Notifications

  • Push/Webhooks
  • Social Media

🎨 Image Tools

  • Generation
  • Analysis/OCR

🔊 Audio/Video

  • Speech Processing
  • Media Analysis

🧠 Reasoning

  • Chain-of-Thought
  • Logical Analysis

🧩 Planning

  • Goal Decomposition
  • Self-Reflection
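The similarity search listed under Vector DBs, which also drives document retrieval in RAG systems, can be sketched in a few lines. The three-dimensional vectors and document names are made up; real embeddings come from an embedding model and have hundreds or thousands of dimensions, and a real vector database uses an index instead of a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]

# Hypothetical embeddings for three documents.
STORE = {
    "doc_git": [0.9, 0.1, 0.0],
    "doc_sql": [0.1, 0.9, 0.0],
    "doc_shell": [0.7, 0.2, 0.1],
}

matches = top_k([1.0, 0.0, 0.0], STORE, k=2)
```

In a RAG pipeline, the text behind the returned ids would then be placed into the prompt as context.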

Programmatic AI Frameworks

🔗 LangChain 🚩
Python JS

🦙 LlamaIndex Python

🌾 Haystack Python

PydanticAI Python

Self-Hosting Options

  • CLI Ollama
  • CLI LM Studio
  • CLI llama.cpp
  • CLI vLLM

Development Tools

📝 VS Code

👨‍💻 GitHub Copilot
📊 Cline
🦘 Roo Code

💻 IDE

Cursor
🌊 Windsurf

⌨️ CLI

🔧 Aider
🤖 Claude Code ⚠️ Only Claude Sonnet

🌐 Web

🔥 Firebase Studio

Biggest Challenge: Specificity

⚠️

GenAI will fail when given tasks that are:

  • Too complex without breakdown
  • Ambiguous in requirements

What to do instead:

  • Break down complex tasks
  • Be specific in instructions

"The quality of your output is directly proportional to the specificity of your input."

The Patterns

Techniques for Effective Prompting

  • Few-shot prompting
  • Chain of Thought
  • Tree of Thought
  • Self-Consistency
  • and many more...
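Two of the techniques above are easy to sketch without any model call: few-shot prompting is mostly prompt assembly, and self-consistency is a majority vote over several sampled answers. The prompt layout, function names and example data below are assumptions for illustration; the model call itself is left out.

```python
from collections import Counter

def build_few_shot_prompt(examples, question):
    """Assemble a prompt that shows worked examples before the real question."""
    lines = []
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    lines.append(f"Input: {question}")
    lines.append("Output:")
    return "\n".join(lines)

def majority_vote(answers):
    """Self-consistency: keep the answer that appears most often across samples."""
    return Counter(answers).most_common(1)[0][0]

prompt = build_few_shot_prompt(
    [("great talk!", "positive"), ("waste of time", "negative")],
    "loved the demos",
)
# Hypothetical answers from sampling the same prompt three times.
final_answer = majority_vote(["positive", "negative", "positive"])
```

Self-consistency costs one model call per sample, so it trades tokens for reliability.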

Use LLMs to prompt LLMs

Development Patterns

Greenfield Development

💡

1. Idea Honing

🧩

2. Task Decomposition

🚀

3. Implementation

Existing Codebases

🔄

Incremental Iteration

🧪

For both: Lots of tests

Task Management Tools

Task Master AI

📋

Cursor Rules

🦘

Roo Code

Cost Management Strategies

Caching

📦

Batching

📊

Token Usage
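As one illustration of caching, the sketch below stores replies on the application side so that identical prompts are only paid for once. This is a different technique from provider-side prompt caching, and `call_model` is a hypothetical stand-in for whatever client function actually contacts the model.

```python
import hashlib

class CachedModel:
    """Wrap a model-calling function with an in-memory cache keyed by prompt."""

    def __init__(self, call_model):
        self._call_model = call_model  # hypothetical function: prompt -> reply
        self._cache = {}
        self.misses = 0

    def complete(self, prompt):
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._cache:
            # Only cache misses reach the (paid) model call.
            self.misses += 1
            self._cache[key] = self._call_model(prompt)
        return self._cache[key]

# Stand-in model that just upper-cases the prompt.
model = CachedModel(lambda prompt: prompt.upper())
first = model.complete("summarise this file")
second = model.complete("summarise this file")
```

This only pays off when prompts repeat exactly and a stale answer is acceptable; tracking `misses` doubles as a simple usage metric.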

MCP: The Protocol That Connects Worlds

"MCP is an open protocol that enables seamless integration between LLM applications and external data sources and tools."

— Anthropic
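On the wire, MCP messages are JSON-RPC 2.0, and a client invokes a server-side tool with the `tools/call` method. The sketch below only builds such a request; the tool name and arguments are hypothetical, and transport, initialisation and response handling are omitted.

```python
import json

def make_tool_call_request(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# "get_weather" and its arguments are made up for illustration.
request = make_tool_call_request(1, "get_weather", {"city": "Tokyo"})
payload = json.dumps(request)
```

In practice an MCP SDK would produce and send these messages, so application code rarely builds them by hand.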

The Agents

Agent

Autonomously make decisions and take actions to achieve a goal
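The definition above boils down to a loop: decide on an action, run a tool, observe the result, and repeat until the goal is met. In this sketch a scripted `decide` function stands in for the LLM, and the action format and tool are hypothetical rather than taken from any specific framework.

```python
def run_agent(goal, decide, tools, max_steps=5):
    """Loop: ask the decision function for an action, run it, record the result."""
    history = []
    for _ in range(max_steps):
        action = decide(goal, history)
        if action["type"] == "finish":
            return action["answer"], history
        result = tools[action["tool"]](**action["args"])
        history.append((action["tool"], action["args"], result))
    # Step budget exhausted without a final answer.
    return None, history

def add(a, b):
    return a + b

def scripted_decide(goal, history):
    """Stand-in for an LLM: call the tool once, then finish with its result."""
    if not history:
        return {"type": "tool", "tool": "add", "args": {"a": 2, "b": 3}}
    return {"type": "finish", "answer": f"The sum is {history[-1][2]}"}

answer, steps = run_agent("add 2 and 3", scripted_decide, {"add": add})
```

The `max_steps` cap is a deliberate design choice: it stops an agent that never decides to finish.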

Key components of Agents

Reflection

Planning

Tools

Collaboration

Types of Agents: Single Agent System

Types of Agents: Hierarchical Multi-Agent System

Types of Agents: Network Multi-Agent System

AI Agents Frameworks

🔄

LangGraph

👥

CrewAI

🤖

AutoGen

🧠

Agents SDK

🛠️

Agent Development Kit

Task Length

"The more it reasons, the more unpredictable it becomes"

— Ilya Sutskever

Thank You!

Let's Connect

  • Connect with me on LinkedIn
  • Share your GenAI challenges
  • Discuss implementation strategies

Questions?

Scan to connect